Binaural audio generation via multi-task learning
نویسندگان
چکیده
We present a learning-based approach for generating binaural audio from mono using multi-task learning. Our formulation leverages additional information two related tasks: the generation task and flipped classification task. learning model extracts spatialization features visual input, predicts left right channels, judges whether channels are flipped. First, we extract ResNet video frames. Next, perform separate subnetworks based on features. method optimizes overall loss weighted sum of losses tasks. train evaluate our FAIR-Play dataset YouTube-ASMR dataset. quantitative qualitative evaluations to demonstrate benefits over prior techniques.
منابع مشابه
Image Cosegmentation via Multi-task Learning
Image segmentation has been studied in computer vision for many years and yet it remains a challenging task. One major difficulty arises from the diversity of the foreground, which often results in ambiguity of backgroundforeground separation, especially when prior knowledge is missing. To overcome this difficulty, cosegmentation methods were proposed, where a set of images sharing some common ...
متن کاملMulti-Task Learning via Conic Programming
When we have several related tasks, solving them simultaneously is shown to be more effective than solving them individually. This approach is called multi-task learning (MTL) and has been studied extensively. Existing approaches to MTL often treat all the tasks as uniformly related to each other and the relatedness of the tasks is controlled globally. For this reason, the existing methods can ...
متن کاملInterest Inference via Structure-Constrained Multi-Source Multi-Task Learning
User interest inference from social networks is a fundamental problem to many applications. It usually exhibits dual-heterogeneities: a user’s interests are complementarily and comprehensively reflected by multiple social networks; interests are inter-correlated in a nonuniform way rather than independent to each other. Although great success has been achieved by previous approaches, few of the...
متن کاملSemantic Segmentation via Multi-task, Multi-domain Learning
We present an approach that leverages multiple datasets possibly annotated using different classes to improve the semantic segmentation accuracy on each individual dataset. We propose a new selective loss function that can be integrated into deep networks to exploit training data coming from multiple datasets with possibly different tasks (e.g., different label-sets). We show how the gradient-r...
متن کاملMulti-task Learning via Non-sparse Multiple Kernel Learning
In object classification tasks from digital photographs, multiple categories are considered for annotation. Some of these visual concepts may have semantic relations and can appear simultaneously in images. Although taxonomical relations and co-occurrence structures between object categories have been studied, it is not easy to use such information to enhance performance of object classificatio...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: ACM Transactions on Graphics
سال: 2021
ISSN: ['0730-0301', '1557-7368']
DOI: https://doi.org/10.1145/3478513.3480560